Distilling Evidence of Long-Range Direction-Specific Causal Cross-Talk in Molecular Evolution of Retro-Viral Genomes

Authors

  • Ishanu Chattopadhyay
  • Hod Lipson
Abstract

Rapid molecular evolution in retroviruses potentially poses a hurdle to effective vaccine design. While the coding sequences for viral surface proteins seemingly mutate randomly from point to point, the necessity of conserved function dictates the often-suspected existence of hidden correlations and long-range dependencies between non-colocated sequence positions. In this initial report, we present a fundamentally new approach to infer the direction-specific causal dependencies that underlie the sequence changes driving viral evolution. Using no prior knowledge of viral genomes, or expectations of known patterns, we show that our algorithm distills the network of causality flows, identifying key regions of immunological vulnerability. Such computationally identified vulnerabilities may open the door to new vaccine designs that highly mutable retroviruses such as HIV fail to evade.

Motivation & Contribution

Design of an effective vaccine for the Human Immunodeficiency Virus (HIV) has eluded researchers for the better part of the last two decades. Sophisticated strategies, such as carbohydrate cloaking and shape-shifting identification molecules, endow the virus with an unmatched ability to evade the host immune response. Perhaps more important than such active evasive maneuvers is the rapid evolution of the viral genome, brought about by its intrinsically high per-base mutation rate. Evolving roughly 13 million times faster than the human genome, the HIV surface proteins present a constantly moving target for the host adaptive immune defense, which simply cannot keep up. Indeed, the genomic diversity of HIV within a single host is large enough to warrant treatment as a multi-species colony. However, surface proteins are not redundant; they play a crucial role in viral assembly, and must therefore conserve function.
It has long been suspected that hidden patterns and correlations are buried in the seemingly random alterations of the genomic sequences, and that non-colocated mutations might have incipient statistical dependencies. Reported techniques that attempt to determine this hidden structure have investigated simple correlations in mutational frequencies. While correlation analyses have had some success, no notion of directional causality is obtainable via such symmetric approaches. In this preliminary report, we present a fundamentally new approach designed to go beyond simple correlations and infer direction-specific causal dependencies in mutational dynamics at locations separated by possibly hundreds of bases in the coding sequences. For a specific surface protein, we show that our analysis reveals key vulnerabilities, which may potentially be exploited for vaccine design.

Relevant Work: Protein Sectors

Proteins display a hierarchy of structural features at primary, secondary, and tertiary levels, an organization that guides our current understanding of their functional properties. In (Halabi et al. 2009), the authors used statistical analysis of correlated evolution between amino acids to reveal a structural organization distinct from this traditional hierarchy. The analysis, applied to S1A serine proteases, indicated a decomposition into three quasi-independent groups of correlated amino acids, termed "protein sectors." Each sector is physically connected in the tertiary structure, has a distinct functional role, and constitutes an independent mode of sequence divergence in the protein family. Functionally relevant sectors are evident in other protein families as well, suggesting that they may be general features of proteins. The authors in (Halabi et al. 2009) proposed that sectors represent a structural organization of proteins that reflects their evolutionary histories.

From Correlated Evolution To Structure & Function

A standard measure of the importance of protein residues is sequence conservation: the degree to which the frequency of amino acids at a given position deviates from random expectation in a well-sampled multiple sequence alignment of the protein family (Capra and Singh 2007; Ng and Henikoff 2006). The more unexpected the amino acid distribution at a position, the stronger the inference of evolutionary constraint and therefore of biological importance. However, protein structure and function also depend on the cooperative action of amino acids, indicating that amino acid distributions at positions cannot be taken as independent of one another (Lockless and Ranganathan 1999). Indeed, analyses of correlations have contributed to the identi…

Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Discovery Informatics: Papers from the AAAI-14 Workshop
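The notion of sequence conservation described above (how far the amino acid frequencies at an alignment position deviate from random expectation) can be made concrete with a small sketch. The version below scores each column by its relative entropy (Kullback-Leibler divergence) against a background distribution. This is one common way to formalize the idea, not necessarily the exact score used in the cited works; the function name `column_conservation`, the uniform background, and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def column_conservation(alignment, background=None):
    """Per-column relative entropy (in bits) against a background distribution.

    `alignment` is a list of equal-length strings (rows of a multiple
    sequence alignment). Characters outside the background alphabet,
    such as gaps ('-'), are ignored. The uniform background used by
    default is an illustrative assumption, not a recommendation.
    """
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

    scores = []
    for pos in range(len(alignment[0])):
        counts = Counter(seq[pos] for seq in alignment if seq[pos] in background)
        total = sum(counts.values())
        if total == 0:
            # Column contains only gaps or unknown symbols.
            scores.append(0.0)
            continue
        score = 0.0
        for aa, n in counts.items():
            freq = n / total
            score += freq * math.log2(freq / background[aa])
        scores.append(score)
    return scores


# Hypothetical toy alignment: columns 0 and 1 are fully conserved,
# column 2 is maximally variable among these four sequences.
toy_alignment = ["ACDA", "ACEA", "ACFG", "ACGK"]
scores = column_conservation(toy_alignment)
```

A score of this kind treats every position independently, which is exactly the limitation the passage goes on to raise: it cannot capture cooperative effects between positions.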





Publication date: 2014